AITopics | nar model

Collaborating Authors

nar model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

5a016da670821af25f151f523a2e563f-Paper-Conference.pdf

Neural Information Processing SystemsFeb-14-2026, 05:47:01 GMT

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)
Asia > China (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
(2 more...)

Add feedback

5a016da670821af25f151f523a2e563f-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 03:33:21 GMT

codec language model, dataset, language model, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)
Asia > China (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
(2 more...)

Add feedback

DYNAC: Dynamic Vocabulary based Non-Autoregressive Contextualization for Speech Recognition

Sudo, Yui, Fukumoto, Yosuke, Shakeel, Muhammad, Peng, Yifan, Lin, Chyi-Jiunn, Watanabe, Shinji

arXiv.org Artificial IntelligenceJun-4-2025

Contextual biasing (CB) improves automatic speech recognition for rare and unseen phrases. Recent studies have introduced dynamic vocabulary, which represents context phrases as expandable tokens in autoregressive (AR) models. This method improves CB accuracy but with slow inference speed. While dynamic vocabulary can be applied to non-autoregressive (NAR) models, such as connectionist temporal classification (CTC), the conditional independence assumption fails to capture dependencies between static and dynamic tokens. This paper proposes DYNAC (Dynamic Vocabulary-based NAR Contextualization), a self-conditioned CTC method that integrates dynamic vocabulary into intermediate layers. Conditioning the encoder on dynamic vocabulary, DYNAC effectively captures dependencies between static and dynamic tokens while reducing the real-time factor (RTF). Experimental results show that DYNAC reduces RTF by 81% with a 0.1-point degradation in word error rate on the LibriSpeech 960 test-clean set.

artificial intelligence, machine learning, speech recognition, (18 more...)

arXiv.org Artificial Intelligence

2506.00422

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Algorithm-Informed Graph Neural Networks for Leakage Detection and Localization in Water Distribution Networks

Zhang, Zepeng, Fink, Olga

arXiv.org Artificial IntelligenceAug-5-2024

Detecting and localizing leakages is a significant challenge for the efficient and sustainable management of water distribution networks (WDN). Leveraging the inherent graph structure of WDNs, recent approaches have used graph-based data-driven methods. However, these methods often learn shortcuts that work well with in-distribution data but fail to generalize to out-of-distribution data. To address this limitation and inspired by the perfect generalization ability of classical algorithms, we propose an algorithm-informed graph neural network (AIGNN). Recognizing that WDNs function as flow networks, incorporating max-flow information can be beneficial for inferring pressures. In the proposed framework, we first train AIGNN to emulate the Ford-Fulkerson algorithm for solving max-flow problems. This algorithmic knowledge is then transferred to address the pressure estimation problem in WDNs. Two AIGNNs are deployed, one to reconstruct pressure based on the current measurements, and another to predict pressure based on previous measurements. Leakages are detected and localized by comparing the outputs of the reconstructor and the predictor. By pretraining AIGNNs to reason like algorithms, they are expected to extract more task-relevant and generalizable features. Experimental results demonstrate that the proposed algorithm-informed approach achieves superior results with better generalization ability compared to GNNs that do not incorporate algorithmic knowledge.

aignn, algorithm, processor, (13 more...)

arXiv.org Artificial Intelligence

2408.02797

Country:

North America > United States > California > Los Angeles County > Santa Monica (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre:

Workflow (0.92)
Overview (0.67)
Research Report > New Finding (0.34)

Industry: Water & Waste Management > Water Management > Water Supplies & Services (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Reinforcement Learning for Edit-Based Non-Autoregressive Neural Machine Translation

Wang, Hao, Morimura, Tetsuro, Honda, Ukyo, Kawahara, Daisuke

arXiv.org Artificial IntelligenceJul-2-2024

Non-autoregressive (NAR) language models are known for their low latency in neural machine translation (NMT). However, a performance gap exists between NAR and autoregressive models due to the large decoding space and difficulty in capturing dependency between target words accurately. Compounding this, preparing appropriate training data for NAR models is a non-trivial task, often exacerbating exposure bias. To address these challenges, we apply reinforcement learning (RL) to Levenshtein Transformer, a representative edit-based NAR model, demonstrating that RL with self-generated data can enhance the performance of edit-based NAR models. We explore two RL approaches: stepwise reward maximization and episodic reward maximization. We discuss the respective pros and cons of these two approaches and empirically verify them. Moreover, we experimentally investigate the impact of temperature setting on performance, confirming the importance of proper temperature setting for NAR models' training.

maximization, opération, reward maximization, (16 more...)

arXiv.org Artificial Intelligence

2405.0128

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Maryland > Baltimore (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

NARRepair: Non-Autoregressive Code Generation Model for Automatic Program Repair

Yang, Zhenyu, Yang, Zhen, Yu, Zhongxing

arXiv.org Artificial IntelligenceJun-24-2024

With the advancement of deep learning techniques, the performance of Automatic Program Repair(APR) techniques has reached a new level. Previous deep learning-based APR techniques essentially modified program sentences in the Autoregressive(AR) manner, which predicts future values based on past values. Due to the manner of word-by-word generation, the AR-based APR technique has a huge time delay. This negative consequence overshadows the widespread adoption of APR techniques in real-life software development. To address the issue, we aim to apply the Non-Autoregressive(NAR) method to the APR task, which can output target code in a parallel manner to avoid huge inference delays. To effectively adapt the NAR manner for the APR task, we in this paper propose NARRepair, the first customized NAR code generation model for the APR task. The NARRepair features three major novelties, including 1) using repair actions to alleviate the over-correction issue, 2) extracting dependency information from AST to alleviate the issue of lacking inter-word dependency information, 3) employing two-stage decoding to alleviate the issue of lacking contextual information. We evaluated NARRepair on three widely used datasets in the APR community, and the results show that our technique can significantly improve the inference speed while maintaining high repair accuracy.

inference speed, information, narrepair model, (14 more...)

arXiv.org Artificial Intelligence

2406.16526

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers

Chen, Sanyuan, Liu, Shujie, Zhou, Long, Liu, Yanqing, Tan, Xu, Li, Jinyu, Zhao, Sheng, Qian, Yao, Wei, Furu

arXiv.org Artificial IntelligenceJun-17-2024

This paper introduces VALL-E 2, the latest advancement in neural codec language models that marks a milestone in zero-shot text-to-speech synthesis (TTS), achieving human parity for the first time. Based on its predecessor, VALL-E, the new iteration introduces two significant enhancements: Repetition Aware Sampling refines the original nucleus sampling process by accounting for token repetition in the decoding history. It not only stabilizes the decoding but also circumvents the infinite loop issue. Grouped Code Modeling organizes codec codes into groups to effectively shorten the sequence length, which not only boosts inference speed but also addresses the challenges of long sequence modeling. Our experiments on the LibriSpeech and VCTK datasets show that VALL-E 2 surpasses previous systems in speech robustness, naturalness, and speaker similarity. It is the first of its kind to reach human parity on these benchmarks. Moreover, VALL-E 2 consistently synthesizes high-quality speech, even for sentences that are traditionally challenging due to their complexity or repetitive phrases. The advantages of this work could contribute to valuable endeavors, such as generating speech for individuals with aphasia or people with amyotrophic lateral sclerosis.

arxiv preprint arxiv, speech, vall-e 2, (12 more...)

arXiv.org Artificial Intelligence

2406.0537

Country: Asia > Middle East > Israel (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Synthesis (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Scaling the Vocabulary of Non-autoregressive Models for Efficient Generative Retrieval

Valluri, Ravisri, Mohankumar, Akash Kumar, Dave, Kushal, Singh, Amit, Jiao, Jian, Varma, Manik, Sinha, Gaurav

arXiv.org Artificial IntelligenceJun-10-2024

Generative Retrieval introduces a new approach to Information Retrieval by reframing it as a constrained generation task, leveraging recent advancements in Autoregressive (AR) language models. However, AR-based Generative Retrieval methods suffer from high inference latency and cost compared to traditional dense retrieval techniques, limiting their practical applicability. This paper investigates fully Non-autoregressive (NAR) language models as a more efficient alternative for generative retrieval. While standard NAR models alleviate latency and cost concerns, they exhibit a significant drop in retrieval performance (compared to AR models) due to their inability to capture dependencies between target tokens. To address this, we question the conventional choice of limiting the target token space to solely words or sub-words. We propose PIXAR, a novel approach that expands the target vocabulary of NAR models to include multi-word entities and common phrases (up to 5 million tokens), thereby reducing token dependencies. PIXAR employs inference optimization strategies to maintain low inference latency despite the significantly larger vocabulary. Our results demonstrate that PIXAR achieves a relative improvement of 31.0% in MRR@10 on MS MARCO and 23.2% in Hits@5 on Natural Questions compared to standard NAR models with similar latency and cost.

nar model, pixar, target vocabulary, (13 more...)

arXiv.org Artificial Intelligence

2406.06739

Country:

North America > United States > Iowa > Polk County > Des Moines (0.05)
Asia > India (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
(2 more...)

Add feedback

SpeechAlign: Aligning Speech Generation to Human Preferences

Zhang, Dong, Li, Zhaowei, Li, Shimin, Zhang, Xin, Wang, Pengyu, Zhou, Yaqian, Qiu, Xipeng

arXiv.org Artificial IntelligenceApr-8-2024

Speech language models have significantly advanced in generating realistic speech, with neural codec language models standing out. However, the integration of human feedback to align speech outputs to human preferences is often neglected. This paper addresses this gap by first analyzing the distribution gap in codec language models, highlighting how it leads to discrepancies between the training and inference phases, which negatively affects performance. Then we explore leveraging learning from human feedback to bridge the distribution gap. We introduce SpeechAlign, an iterative self-improvement strategy that aligns speech language models to human preferences. SpeechAlign involves constructing a preference codec dataset contrasting golden codec tokens against synthetic tokens, followed by preference optimization to improve the codec language model. This cycle of improvement is carried out iteratively to steadily convert weak models to strong ones. Through both subjective and objective evaluations, we show that SpeechAlign can bridge the distribution gap and facilitating continuous self-improvement of the speech language model. Moreover, SpeechAlign exhibits robust generalization capabilities and works for smaller models. Code and models will be available at https://github.com/0nutation/SpeechGPT.

codec language model, human feedback, language model, (16 more...)

arXiv.org Artificial Intelligence

2404.056

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Non-autoregressive Sequence-to-Sequence Vision-Language Models

Shi, Kunyu, Dong, Qi, Goncalves, Luis, Tu, Zhuowen, Soatto, Stefano

arXiv.org Artificial IntelligenceMar-4-2024

Sequence-to-sequence vision-language models are showing promise, but their applicability is limited by their inference latency due to their autoregressive way of generating predictions. We propose a parallel decoding sequence-to-sequence vision-language model, trained with a Query-CTC loss, that marginalizes over multiple inference paths in the decoder. This allows us to model the joint distribution of tokens, rather than restricting to conditional distribution as in an autoregressive model. The resulting model, NARVL, achieves performance on-par with its state-of-the-art autoregressive counterpart, but is faster at inference time, reducing from the linear complexity associated with the sequential generation of tokens to a paradigm of constant time joint inference.

decoder, output sequence, sequence, (17 more...)

arXiv.org Artificial Intelligence

2403.02249

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)

Add feedback